CDB_cycles_AnalysisOfParameters

In this notebook, we study the different parameters of our method, Complexity Driven Bagging, in order to offer a recommended range of values to the end user. In particular, we have three parameters: alpha, split, and the number of cycles.

Besides these three parameters, we have obtained results for different complexity measures. For the parameter analysis, we aggregate the results over the different complexity measures.

First, we study each of the parameters (alpha, split, number of cycles) independently to determine for which values there are no significant differences, so that those values can be eliminated from the range of recommended values.

After this analysis, we will have a first recommended range for every parameter. Note that, in all cases, we take into account the mean, median and standard deviation of the accuracy.

Parameter analysis

Mean, median and standard deviation of accuracy for all levels of split

table_split <- datos %>%
  group_by(split) %>%
  summarise_at(vars(accuracy_mean_mean),  list(mean = mean, median = median, std = sd))
knitr::kable(table_split)
split mean median std
1 0.8092987 0.7959274 0.1147976
2 0.8107417 0.7951807 0.1148777
4 0.8112232 0.7938258 0.1148703
6 0.8113628 0.7934404 0.1149003
8 0.8116497 0.7930870 0.1147581
10 0.8117583 0.7931859 0.1147453
12 0.8118115 0.7930393 0.1146517
14 0.8118871 0.7928307 0.1147101
16 0.8120610 0.7928717 0.1145618
18 0.8119867 0.7931512 0.1146432
20 0.8121319 0.7934534 0.1145823
22 0.8119735 0.7930799 0.1146288
24 0.8122418 0.7928943 0.1145285
26 0.8121776 0.7932803 0.1146281
28 0.8123048 0.7934266 0.1145013
30 0.8122980 0.7932961 0.1145704

The higher the value of split, the higher the mean of accuracy (with some exceptions), the lower the median, and the lower the standard deviation. This suggests medium-to-low split values.

If we compare whether there are significant differences among the split values (after aggregating over n_cycle), we obtain the following:

For the mean of the accuracy, there are no significant differences among:

  • 4 with 6, 6 with 8 and 12

  • 8 with 10,12,14, 22

  • 10 with 12, 14,16, 18, 22

  • 12 with 14, 18, 22

  • 14 with 16, 18, 20, 22

  • From 16 to 30, almost all comparisons are not significantly different –> maximum value of split should be 16

For the median of the accuracy, there are no significant differences among:

  • 4 with 6 and 10

  • 6 with 8, 10, 12, 14

  • 8 with 10, 12, 14, 16, 18, 20, 22

  • 10 with 12, 14, 16, 18, 20, 22

  • 12 with 14, 16, 18, 20, 22

  • 14 with 16, 18, 20, 22, 26

  • From 16 to 30, almost all comparisons are not significantly different –> maximum value of split should be 16

For the std of the accuracy, there are no significant differences among:

  • 4 with 6 and 8

  • 6 with 8, 10, 12

  • 8 with 10, 12, 14, 16, 20

  • 10 with 12, 14, 16, 20, 22, 30

  • 12 with 14, 16, 20, 22, 26, 30

  • From 14 to 30, almost all comparisons are not significantly different –> maximum value of split should be 14
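Such pairwise comparisons can be reproduced, for example, with base R's `pairwise.wilcox.test`. The following is a minimal sketch on synthetic data; the split levels and accuracy values are illustrative, not taken from `datos`:

```r
# Sketch: paired Wilcoxon comparisons between split levels, Bonferroni-adjusted.
# Synthetic data; the real analysis runs on `datos` aggregated per n_cycle.
set.seed(1)
toy <- data.frame(
  split    = factor(rep(c(1, 2, 4), each = 20)),
  accuracy = c(rnorm(20, 0.800, 0.01), rnorm(20, 0.810, 0.01), rnorm(20, 0.812, 0.01))
)
pw <- pairwise.wilcox.test(toy$accuracy, toy$split,
                           paired = TRUE, p.adjust.method = "bonferroni")
pw$p.value  # matrix of adjusted p-values; entries above 0.05 indicate no significant difference
```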

Mean, median and standard deviation of accuracy for all levels of alpha

table_alpha <- datos %>%
  group_by(alpha) %>%
  summarise_at(vars(accuracy_mean_mean),  list(mean = mean, median = median, std = sd))
knitr::kable(table_alpha)
alpha mean median std
2 0.8112319 0.7943175 0.1147040
4 0.8112529 0.7939886 0.1147610
6 0.8111501 0.7942333 0.1146813
8 0.8110590 0.7941433 0.1146313
10 0.8108840 0.7941460 0.1146409
12 0.8105879 0.7941476 0.1148773
14 0.8105690 0.7945687 0.1149103
16 0.8103790 0.7938420 0.1149157
18 0.8106317 0.7938928 0.1147642
20 0.8104071 0.7941098 0.1149047

The higher the value of alpha, the lower the mean and the median of accuracy. The standard deviation remains lower for low-to-medium values. This suggests low-to-medium values of alpha, below 12.

If we compare whether there are significant differences among the alpha values (after aggregating over n_cycle), we obtain the following:

For the mean of the accuracy, there are ONLY significant differences among:

  • 2 with 10

  • 6 with 16

  • 8 with 16

  • 10 with 12, 14, 16, 18, 20

For the median of the accuracy, there are ONLY significant differences among:

  • 2 with 6, 8, 10

  • 10 with 16, 20

For the std of the accuracy, there are NO significant differences among:

  • 4 with 6 and 8

  • 6 with 8, 10, 12

  • 8 with 10, 12

  • From 10 to 20, almost all comparisons are not significantly different –> maximum value of alpha should be 10

Mean, median and standard deviation of accuracy for all levels of n_cycles (for some split values)

We cannot produce a general summary over ‘n_cycle’ because the number of cycles depends on the value of split. Thus, we show some representative cases.

split = 1

table_split1 <- datos %>% filter(split == 1) %>%
  group_by(n_cycle) %>%
  summarise_at(vars(accuracy_mean_mean),  list(mean = mean, median = median, std = sd))
#knitr::kable(table_split1)

#datatable(table_split1)
## for HTML output only
#library(DT)

## Create an interactive table with pagination
#datatable(table_split1, 
#          options = list(pageLength = 15, # show 15 rows per page
#                         lengthMenu = c(15, 30, 50, 100), # rows-per-page options
#                         autoWidth = TRUE))

The higher the number of cycles, the higher the mean and median of accuracy, and the lower the standard deviation. For high numbers of cycles, the accuracy clearly stabilizes and there is not always a clear increase over time. For example, results with 89 cycles are better than with 100.

split = 2

table_split2 <- datos %>% filter(split == 2) %>%
  group_by(n_cycle) %>%
  summarise_at(vars(accuracy_mean_mean),  list(mean = mean, median = median, std = sd))
knitr::kable(table_split2)
n_cycle mean median std
1 0.7853971 0.7658850 0.1239979
2 0.7955934 0.7810546 0.1206018
3 0.8021948 0.7858371 0.1181474
4 0.8042215 0.7883482 0.1173910
5 0.8063374 0.7902878 0.1165199
6 0.8071462 0.7908371 0.1163221
7 0.8080506 0.7914324 0.1160103
8 0.8087847 0.7921925 0.1156557
9 0.8091661 0.7926405 0.1155370
10 0.8095782 0.7930813 0.1153168
11 0.8100086 0.7933066 0.1151377
12 0.8100990 0.7933759 0.1150841
13 0.8103538 0.7937993 0.1149684
14 0.8105714 0.7941767 0.1149330
15 0.8108942 0.7945402 0.1148323
16 0.8109253 0.7946515 0.1148393
17 0.8111479 0.7948427 0.1148419
18 0.8112732 0.7947423 0.1147802
19 0.8114087 0.7953675 0.1147639
20 0.8113834 0.7951807 0.1147448
21 0.8116881 0.7953344 0.1145390
22 0.8116573 0.7953354 0.1145920
23 0.8118696 0.7955154 0.1144297
24 0.8119069 0.7954218 0.1144594
25 0.8119956 0.7957162 0.1143784
26 0.8120004 0.7955846 0.1144020
27 0.8121277 0.7957797 0.1143006
28 0.8121187 0.7961178 0.1142760
29 0.8121081 0.7960402 0.1143503
30 0.8121352 0.7957588 0.1143345
31 0.8122648 0.7962517 0.1143523
32 0.8123752 0.7960415 0.1142631
33 0.8123683 0.7959932 0.1143042
34 0.8123711 0.7960135 0.1142772
35 0.8124318 0.7957797 0.1142546
36 0.8124696 0.7960139 0.1142538
37 0.8124844 0.7959146 0.1143172
38 0.8125643 0.7959839 0.1142551
39 0.8126494 0.7963088 0.1142045
40 0.8126500 0.7961395 0.1141759
41 0.8126006 0.7960233 0.1142213
42 0.8126436 0.7960113 0.1142012
43 0.8126348 0.7960226 0.1141702
44 0.8126457 0.7960123 0.1141959
45 0.8126805 0.7961793 0.1141843
46 0.8127261 0.7963512 0.1141605
47 0.8127523 0.7962641 0.1141475
48 0.8127691 0.7965876 0.1141673
49 0.8128038 0.7966290 0.1141481
50 0.8128280 0.7966724 0.1141625
51 0.8128212 0.7965620 0.1141359
52 0.8128713 0.7966063 0.1141631
53 0.8127992 0.7967943 0.1141847
54 0.8128502 0.7964785 0.1141343
55 0.8128538 0.7964524 0.1141290
56 0.8128305 0.7966296 0.1141977
57 0.8128659 0.7966053 0.1141948
58 0.8128923 0.7963933 0.1141678
59 0.8129489 0.7966044 0.1141921
60 0.8129136 0.7965633 0.1142095

The higher the number of cycles, the higher the mean and median of accuracy, and the lower the standard deviation. For high numbers of cycles, the accuracy clearly stabilizes and there is not always a clear increase over time.

split = 4

table_split4 <- datos %>% filter(split == 4) %>%
  group_by(n_cycle) %>%
  summarise_at(vars(accuracy_mean_mean),  list(mean = mean, median = median, std = sd))
knitr::kable(table_split4)
n_cycle mean median std
1 0.7964150 0.7797189 0.1203589
2 0.8033216 0.7854461 0.1180312
3 0.8068885 0.7890164 0.1166376
4 0.8083118 0.7912175 0.1161173
5 0.8094067 0.7917305 0.1156737
6 0.8099889 0.7925033 0.1154870
7 0.8104188 0.7922330 0.1152974
8 0.8109216 0.7921687 0.1150842
9 0.8111544 0.7926198 0.1150265
10 0.8114268 0.7922984 0.1148640
11 0.8116804 0.7933066 0.1147552
12 0.8118249 0.7939326 0.1146853
13 0.8120731 0.7938902 0.1144945
14 0.8121403 0.7941411 0.1145572
15 0.8121389 0.7938028 0.1145616
16 0.8122687 0.7939090 0.1145346
17 0.8124095 0.7942513 0.1145056
18 0.8124360 0.7943176 0.1144360
19 0.8124641 0.7944077 0.1144784
20 0.8125574 0.7944574 0.1144006
21 0.8126189 0.7946072 0.1143768
22 0.8126793 0.7947791 0.1143619
23 0.8127204 0.7945826 0.1143730
24 0.8127888 0.7946916 0.1144135
25 0.8128566 0.7946926 0.1143158
26 0.8129487 0.7948461 0.1142807
27 0.8129609 0.7949083 0.1143199
28 0.8129670 0.7948229 0.1142952
29 0.8130106 0.7948855 0.1142529
30 0.8131037 0.7949547 0.1142461
31 0.8131281 0.7950469 0.1142050
32 0.8131329 0.7949534 0.1142309
33 0.8131985 0.7951973 0.1142136
34 0.8132277 0.7950469 0.1142080

The higher the number of cycles, the higher the mean and median of accuracy, and the lower the standard deviation. For high numbers of cycles, the accuracy stabilizes but keeps showing an increasing trend. The longer the run, the less stable the (still increasing) trend.

split = 10

table_split10 <- datos %>% filter(split == 10) %>%
  group_by(n_cycle) %>%
  summarise_at(vars(accuracy_mean_mean),  list(mean = mean, median = median, std = sd))
knitr::kable(table_split10)
n_cycle mean median std
1 0.8052708 0.7858695 0.1173541
2 0.8090385 0.7909640 0.1159272
3 0.8107343 0.7923280 0.1152991
4 0.8115094 0.7932952 0.1149629
5 0.8118858 0.7929970 0.1147897
6 0.8121280 0.7920622 0.1146498
7 0.8124481 0.7940731 0.1145583
8 0.8125687 0.7945700 0.1145587
9 0.8127579 0.7945227 0.1144735
10 0.8128867 0.7944792 0.1143970
11 0.8129266 0.7939709 0.1143859
12 0.8129714 0.7938801 0.1143991
13 0.8130340 0.7944180 0.1143002
14 0.8130330 0.7941496 0.1143587
15 0.8131818 0.7947400 0.1142817

The higher the number of cycles, the higher the mean and median of accuracy, and the lower the standard deviation. For high numbers of cycles, the accuracy stabilizes but keeps showing an increasing trend. The longer the run, the less stable the (still increasing) trend.

Number of cycles

# We have to run the analysis for each combo_alpha_split
valores_combo = levels(datos$combo_alpha_split)
n_combo = length(valores_combo)
combo_friedman = data.frame(valores_combo)
combo_friedman$p_value = rep(NA,n_combo)

for (i in valores_combo){
  #print(i)
  datos_i = datos[datos$combo_alpha_split==i,]
  fri = friedman.test(accuracy_mean_mean ~ n_cycle | Dataset, data = datos_i)
  combo_friedman[combo_friedman$valores_combo==i,2] = fri$p.value
}
combo_friedman[combo_friedman$p_value > 0.05, ]
# i.e., in all cases there are significant differences
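For reference, a self-contained toy example of the `friedman.test` call used above (the dataset names and effect sizes are invented; in the notebook, treatments are cycle counts and blocks are datasets):

```r
# Toy Friedman test: accuracy for 3 cycle counts (treatments) on 8 datasets (blocks).
set.seed(42)
toy <- expand.grid(Dataset = paste0("d", 1:8), n_cycle = c(1, 5, 10))
toy$accuracy <- 0.75 + 0.005 * toy$n_cycle + rnorm(nrow(toy), 0, 0.01)
fri <- friedman.test(accuracy ~ n_cycle | Dataset, data = toy)
fri$p.value  # a small p-value would indicate that at least one cycle count differs
```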

Once we have checked that there are significant differences for at least one value in each combo, we perform multiple comparisons to analyze when adding another cycle is not worthwhile, i.e., when the increase is not significant.
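The filtering rule applied in the commented-out code below is the subtle part: a cycle count v is kept only when it is non-significantly different from all larger cycle counts, which means it must appear (max_cycles - v) times as group1 among the non-significant pairs. A toy illustration with invented pairs:

```r
# Non-significant pairs (toy): (3,5), (4,5), (4,6), (5,6); cycle counts go up to 6.
no_sig <- data.frame(group1 = c(3, 4, 4, 5), group2 = c(5, 5, 6, 6))
max_cycles <- 6
for (v in unique(no_sig$group1)) {
  # 3 is dropped: it should pair non-significantly with 4, 5 and 6, but only does with 5
  if (sum(no_sig$group1 == v) < (max_cycles - v)) {
    no_sig <- no_sig[no_sig$group1 != v, ]
  }
}
unique(c(no_sig$group1, no_sig$group2))  # 4 5 6: from 4 cycles on, no significant gains
```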

dif_no_sig <- data.frame(valores_combo)
dif_no_sig$niveles = rep(NA,n_combo)

# Left commented out because it takes a long time to run

# for (i in valores_combo){
#   print(i)
#   datos_i = datos[datos$combo_alpha_split==i,]
#   datos_i$n_cycle <- factor(datos_i$n_cycle) # the factor levels change in each subset
#   pwc2 <- datos_i %>%
#     wilcox_test(accuracy_mean_mean ~ n_cycle, paired = TRUE, p.adjust.method = "bonferroni")
#   # Keep the comparisons with non-significant differences (threshold: adjusted p > 0.1)
#   no_significativas <- pwc2[pwc2$p.adj>0.1,]
# 
# 
#   # if not all comparisons with a given level are non-significant, we remove it:
#   # it is not enough that only 3 vs 5 shows no difference while the rest (3-6, 3-7, etc.) do
#   max_cycles = max(as.numeric(pwc2$group2))
#   valores_check <- unique(as.numeric(no_significativas$group1))
#   for (v in valores_check){
#     if (sum(no_significativas$group1 == v) <(max_cycles - v) ){
#       no_significativas = no_significativas[no_significativas$group1!=v,]
#     }
#   }
# 
#   # Extract the levels of the pairs with non-significant differences
#   niveles_no_significativos <- unique(c(no_significativas$group1, no_significativas$group2))
#   dif_no_sig[dif_no_sig$valores_combo==i,2] = paste(niveles_no_significativos, collapse = ", ")
# }
# 
# write.csv(dif_no_sig, "CDB_cycles_ParametersComboAlphaSplit_dif_no_signif_cycles_mean.csv")

In this data frame we have, for every combination of alpha and split, the numbers of cycles among which there are no significant differences.

dif_no_sig_mean <- read.csv('CDB_cycles_ParametersComboAlphaSplit_dif_no_signif_cycles_mean.csv') 
head(dif_no_sig_mean)
  X   valores_combo
1 1  alpha10-split1
2 2 alpha10-split10
3 3 alpha10-split12
4 4 alpha10-split14
5 5 alpha10-split16
6 6 alpha10-split18
                                                                                                                                                                                                                                                                                                                          niveles
1 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
2                                                                                                                                                                                                                                                                                              6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                    4, 5, 6, 7, 8, 9, 10, 11, 12
4                                                                                                                                                                                                                                                                                                     3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                            4, 5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                   5, 6, 7, 8, 9

Let’s relate this to the number of models, to get a first view of where to stop adding models. We create two columns:

  • num_models: the number of models implied by the first (minimum) number of cycles that presents no significant differences with all subsequent numbers of cycles.

  • num_models2: exactly the same concept as num_models, but using the second such number of cycles, in case we want to be more conservative.
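As a sanity check of the mapping used in the code below (each cycle contributes 2*split + 1 models), the alpha10-split10 row can be recomputed by hand:

```r
# "niveles" for alpha10-split10 starts at 6 cycles (see the table above)
niveles     <- "6, 7, 8, 9, 10, 11, 12, 13, 14, 15"
valor_split <- 10
ciclos      <- sort(as.numeric(strsplit(niveles, ", ")[[1]]))
num_models  <- ciclos[1] * (2 * valor_split + 1)  # first stable cycle count: 6 * 21
num_models2 <- ciclos[2] * (2 * valor_split + 1)  # more conservative: 7 * 21
c(num_models, num_models2)  # 126 147, matching the table row
```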

# Variables to character
dif_no_sig_mean$niveles <- as.character(dif_no_sig_mean$niveles)
dif_no_sig_mean$valores_combo <- as.character(dif_no_sig_mean$valores_combo)

# Order the values 
dif_no_sig_mean$niveles <- sapply(strsplit(dif_no_sig_mean$niveles, ", "), function(x) {
  paste(sort(as.numeric(x)), collapse = ", ")
})

# Extract the numeric value after "split" in the valores_combo column
dif_no_sig_mean$valor_split <- as.numeric(gsub(".*split", "", dif_no_sig_mean$valores_combo))

# New columns with number of models
dif_no_sig_mean$num_models <- mapply(function(a, b) {
  min(as.numeric(strsplit(a, ", ")[[1]])) * (2*b +1)
}, dif_no_sig_mean$niveles, dif_no_sig_mean$valor_split)

# New columns with number of models (for the second value)
dif_no_sig_mean$num_models2 <- mapply(function(a, b) {
  valores <- sort(as.numeric(strsplit(a, ", ")[[1]])) 
  segundo_min <- ifelse(length(valores) > 1, valores[2], valores[1])  # take the second minimum, or the first if there is only one
  segundo_min * (2*b +1)
}, dif_no_sig_mean$niveles, dif_no_sig_mean$valor_split)

head(dif_no_sig_mean)
  X   valores_combo
1 1  alpha10-split1
2 2 alpha10-split10
3 3 alpha10-split12
4 4 alpha10-split14
5 5 alpha10-split16
6 6 alpha10-split18
                                                                                                                                                                                                                                                                                                                          niveles
1 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
2                                                                                                                                                                                                                                                                                              6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                    4, 5, 6, 7, 8, 9, 10, 11, 12
4                                                                                                                                                                                                                                                                                                     3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                            4, 5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                   5, 6, 7, 8, 9
  valor_split num_models num_models2
1           1         63          66
2          10        126         147
3          12        100         125
4          14         87         116
5          16        132         165
6          18        185         222

We perform the same analysis for the median and the standard deviation.

For the median:

# dif_no_sig_mean$niveles_mediana = rep(NA,n_combo)
# 
# # Left commented out because it takes a long time to run
# 
# for (i in valores_combo){
#   print(i)
#   datos_i = datos[datos$combo_alpha_split==i,]
#   datos_i$n_cycle <- factor(datos_i$n_cycle) # the factor levels change in each subset
#   pwc2 <- datos_i %>%
#     wilcox_test(accuracy_mean_median ~ n_cycle, paired = TRUE, p.adjust.method = "bonferroni")
#   # Keep the comparisons with non-significant differences (threshold: adjusted p > 0.1)
#   no_significativas <- pwc2[pwc2$p.adj>0.1,]
# 
# 
#   # it is not enough that only 3 vs 5 shows no difference while the rest (3-6, 3-7, etc.) do
#   max_cycles = max(as.numeric(pwc2$group2))
#   valores_check <- unique(as.numeric(no_significativas$group1))
#   for (v in valores_check){
#     if (sum(no_significativas$group1 == v) <(max_cycles - v) ){
#       no_significativas = no_significativas[no_significativas$group1!=v,]
#     }
#   }
# 
#   # Extract the levels of the pairs with non-significant differences
#   niveles_no_significativos <- unique(c(no_significativas$group1, no_significativas$group2))
# 
#  dif_no_sig_mean[dif_no_sig_mean$valores_combo==i,'niveles_mediana'] = paste(niveles_no_significativos, collapse = ", ")
# }
# 
# write.csv(dif_no_sig_mean, "CDB_cycles_ParametersComboAlphaSplit_dif_no_signif_cycles_mean_median.csv")

For the standard deviation:

# dif_no_sig_mean$niveles_std = rep(NA,n_combo)
# 
# # Left commented out because it takes a long time to run
# 
# for (i in valores_combo){
#   print(i)
#   datos_i = datos[datos$combo_alpha_split==i,]
#   datos_i$n_cycle <- factor(datos_i$n_cycle) # the factor levels change in each subset
#   pwc2 <- datos_i %>%
#     wilcox_test(accuracy_mean_std ~ n_cycle, paired = TRUE, p.adjust.method = "bonferroni")
#   # Keep the comparisons with non-significant differences (threshold: adjusted p > 0.1)
#   no_significativas <- pwc2[pwc2$p.adj>0.1,]
# 
# 
#   # if not all comparisons with a given level are non-significant, we remove it:
#   # it is not enough that only 3 vs 5 shows no difference while the rest (3-6, 3-7, etc.) do
#   max_cycles = max(as.numeric(pwc2$group2))
#   valores_check <- unique(as.numeric(no_significativas$group1))
#   for (v in valores_check){
#     if (sum(no_significativas$group1 == v) <(max_cycles - v) ){
#       no_significativas = no_significativas[no_significativas$group1!=v,]
#     }
#   }
# 
#   # Extract the levels of the pairs with non-significant differences
#   niveles_no_significativos <- unique(c(no_significativas$group1, no_significativas$group2))
#  
#   dif_no_sig_mean[dif_no_sig_mean$valores_combo==i,'niveles_std'] = paste(niveles_no_significativos, collapse = ", ")
# }
# 
# write.csv(dif_no_sig_mean, "CDB_cycles_ParametersComboAlphaSplit_dif_no_signif_cycles_mean_median_std.csv")

Now we relate the number of cycles with the number of models for all statistical measures (mean, median, std):

dif_no_sig_all <- read.csv('CDB_cycles_ParametersComboAlphaSplit_dif_no_signif_cycles_mean_median_std.csv') 
head(dif_no_sig_all)
  X.1 X   valores_combo
1   1 1  alpha10-split1
2   2 2 alpha10-split10
3   3 3 alpha10-split12
4   4 4 alpha10-split14
5   5 5 alpha10-split16
6   6 6 alpha10-split18
                                                                                                                                                                                                                                                                                                                          niveles
1 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
2                                                                                                                                                                                                                                                                                              6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                    4, 5, 6, 7, 8, 9, 10, 11, 12
4                                                                                                                                                                                                                                                                                                     3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                            4, 5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                   5, 6, 7, 8, 9
  valor_split num_models num_models2
1           1         63          66
2          10        126         147
3          12        100         125
4          14         87         116
5          16        132         165
6          18        185         222
                                                                                                                                                                                                                                                                                                                                          niveles_mediana
1 15, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 16, 17, 18, 100
2                                                                                                                                                                                                                                                                                                                4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                                         3, 5, 6, 7, 8, 9, 10, 11, 4, 12
4                                                                                                                                                                                                                                                                                                                          2, 3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                                              2, 3, 4, 5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                                     3, 4, 5, 6, 7, 8, 9
                                                                                                                                                                                                                                                                                                                                                                                    niveles_std
1 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 5, 6, 100
2                                                                                                                                                                                                                                                                                                                                                         5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                                                                                     5, 6, 7, 8, 9, 10, 11, 12
4                                                                                                                                                                                                                                                                                                                                                                   3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                                                                                          4, 5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                                                                           3, 4, 5, 6, 7, 8, 9
# Variables to character
dif_no_sig_all$niveles_mediana <- as.character(dif_no_sig_all$niveles_mediana)
dif_no_sig_all$niveles_std <- as.character(dif_no_sig_all$niveles_std)
dif_no_sig_all$valores_combo <- as.character(dif_no_sig_all$valores_combo)

# Order the values 
dif_no_sig_all$niveles_mediana <- sapply(strsplit(dif_no_sig_all$niveles_mediana, ", "), function(x) {
  paste(sort(as.numeric(x)), collapse = ", ")
})

dif_no_sig_all$niveles_std <- sapply(strsplit(dif_no_sig_all$niveles_std, ", "), function(x) {
  paste(sort(as.numeric(x)), collapse = ", ")
})


# New columns with number of models
dif_no_sig_all$num_models_mediana <- mapply(function(a, b) {
  min(as.numeric(strsplit(a, ", ")[[1]])) * (2*b +1)
}, dif_no_sig_all$niveles_mediana, dif_no_sig_all$valor_split)

dif_no_sig_all$num_models_std <- mapply(function(a, b) {
  min(as.numeric(strsplit(a, ", ")[[1]])) * (2*b +1)
}, dif_no_sig_all$niveles_std, dif_no_sig_all$valor_split)

# New columns with number of models (for the second value)
dif_no_sig_all$num_models2_mediana <- mapply(function(a, b) {
  valores <- sort(as.numeric(strsplit(a, ", ")[[1]])) 
  segundo_min <- ifelse(length(valores) > 1, valores[2], valores[1])  # take the second minimum, or the first if there is only one
  segundo_min * (2*b +1)
}, dif_no_sig_all$niveles_mediana, dif_no_sig_all$valor_split)

dif_no_sig_all$num_models2_std <- mapply(function(a, b) {
  valores <- sort(as.numeric(strsplit(a, ", ")[[1]])) 
  segundo_min <- ifelse(length(valores) > 1, valores[2], valores[1])  # Second-smallest value, or the first if there is only one
  segundo_min * (2*b +1)
}, dif_no_sig_all$niveles_std, dif_no_sig_all$valor_split)

# Also extract the value in cycles
dif_no_sig_all$cycles_mean <- sapply(strsplit(dif_no_sig_all$niveles, ", "), function(x) {
  min(as.numeric(x))})

dif_no_sig_all$cycles_median <- sapply(strsplit(dif_no_sig_all$niveles_mediana, ", "), function(x) {
  min(as.numeric(x))})

dif_no_sig_all$cycles_std <- sapply(strsplit(dif_no_sig_all$niveles_std, ", "), function(x) {
  min(as.numeric(x))})


head(dif_no_sig_all)
  X.1 X   valores_combo
1   1 1  alpha10-split1
2   2 2 alpha10-split10
3   3 3 alpha10-split12
4   4 4 alpha10-split14
5   5 5 alpha10-split16
6   6 6 alpha10-split18
                                                                                                                                                                                                                                                                                                                          niveles
1 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
2                                                                                                                                                                                                                                                                                              6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                    4, 5, 6, 7, 8, 9, 10, 11, 12
4                                                                                                                                                                                                                                                                                                     3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                            4, 5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                   5, 6, 7, 8, 9
  valor_split num_models num_models2
1           1         63          66
2          10        126         147
3          12        100         125
4          14         87         116
5          16        132         165
6          18        185         222
                                                                                                                                                                                                                                                                                                                                          niveles_mediana
1 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
2                                                                                                                                                                                                                                                                                                                4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                                         3, 4, 5, 6, 7, 8, 9, 10, 11, 12
4                                                                                                                                                                                                                                                                                                                          2, 3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                                              2, 3, 4, 5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                                     3, 4, 5, 6, 7, 8, 9
                                                                                                                                                                                                                                                                                                                                                                                    niveles_std
1 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
2                                                                                                                                                                                                                                                                                                                                                         5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                                                                                     5, 6, 7, 8, 9, 10, 11, 12
4                                                                                                                                                                                                                                                                                                                                                                   3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                                                                                          4, 5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                                                                           3, 4, 5, 6, 7, 8, 9
  num_models_mediana num_models_std num_models2_mediana num_models2_std
1                 45             12                  48              15
2                 84            105                 105             126
3                 75            125                 100             150
4                 58             87                  87             116
5                 66            132                  99             165
6                111            111                 148             148
  cycles_mean cycles_median cycles_std
1          21            15          4
2           6             4          5
3           4             3          5
4           3             2          3
5           4             2          4
6           5             3          3
#write.csv(dif_no_sig_all, "CDB_cycles_ParametersComboAlphaSplit_dif_no_signif_cycles_mean_median_std_num_models.csv")
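The `num_models` columns above convert the smallest cycle count in each group into a model count, assuming, as the code does, that each cycle contributes 2*split + 1 models. A minimal, self-contained sketch of that conversion:

```r
# Sketch of the cycles-to-models conversion used above.
# Assumption (from the code): each cycle contributes 2*split + 1 models.
cycles_to_models <- function(niveles, split) {
  min(niveles) * (2 * split + 1)
}

# E.g., levels 6..15 with split = 10 (row 2 of dif_no_sig_all): 6 * 21 = 126
cycles_to_models(6:15, 10)
```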

We have performed a multiple-comparison analysis for the number of cycles with respect to the mean, median and std of the accuracy, finding, for each statistical measure, the number of cycles above which there is no significant difference. Now we take the maximum number of cycles over the three statistical measures to select a range of cycles. After that, we can analyze which cycle structure is best (for example, a large number of short cycles or a small number of long cycles).

dif_no_sig_all$max_num_cycles <- apply(X=dif_no_sig_all[,c('cycles_mean','cycles_median','cycles_std')], MARGIN=1, FUN=max)
dif_no_sig_all$max_num_models <- apply(X=dif_no_sig_all[,c('num_models','num_models_mediana','num_models_std')], MARGIN=1, FUN=max)

#write.csv(dif_no_sig_all, "CDB_cycles_ParametersComboAlphaSplit_dif_no_signif_cycles_mean_median_std_num_models.csv")
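The row-wise maxima above use `apply` with `MARGIN = 1`; a toy illustration, with values taken from the first two rows of the output shown earlier:

```r
# Row-wise maximum over the three cycle columns, as done above.
toy <- data.frame(cycles_mean = c(21, 6), cycles_median = c(15, 4), cycles_std = c(4, 5))
apply(X = toy, MARGIN = 1, FUN = max)  # 21 6
```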

If we analyze the number of models above which there are no significant differences, we can see that the maximum value is 294 and the 75th percentile is 180.8, implying that our maximum tested number of models (300) is sufficient and that ensembles with fewer models can obtain competitive accuracy results.

p<-ggplot(dif_no_sig_all, aes(x=max_num_models)) + 
  geom_histogram(color="black", fill="white")
p

summary(dif_no_sig_all$max_num_models)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   54.0   107.5   145.0   146.3   180.8   294.0 

From the study of alpha and split we know that:

  • For split from 16 to 30, almost all comparisons are not significantly different –> the maximum value of split should be below 16. Domain: [1, 2, 4, 6, 8, 10, 12, 14]

  • Maximum alpha value should be 10-12: Domain: [2, 4, 6, 8, 10]

Note that these ranges are in line with the previous study, where we made no distinction between cycles.

Let’s filter now the previous information according to these ranges:

dif_no_sig_all$valor_alpha <- as.numeric(gsub("alpha([0-9]+)-split[0-9]+", "\\1", dif_no_sig_all$valores_combo))
# Filter the dataset to keep only rows with alpha < 12 and split < 16
df_filtered <- dif_no_sig_all[(dif_no_sig_all$valor_alpha < 12 & dif_no_sig_all$valor_split < 16), ]
summary(df_filtered$max_num_models)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  55.00   89.25  112.00  111.17  125.00  203.00 

After filtering out the undesired values for alpha and split, the maximum number of models at which the significant differences stop is about 200 (203).

This df_filtered has 40 rows, that is, 40 combinations of alpha and split. We will therefore study them one by one: for each case, I plot the evolution of accuracy and mark the point from which there are supposedly no significant differences. We do this again aggregated over complexity measures, and then I pick a couple of measures and visualize them individually. Here I am a bit lost because, in general, accuracy appears to increase with the number of models, but supposedly no longer significantly, and we should be guided by that. I think the order is:

  1. Produce these accuracy-evolution plots for the 40 cases and inspect them for some complexity measure

  2. Select the best accuracy value for these 40 cases

  3. Draw conclusions from comparing 1 and 2

  4. For our method, choose the best parameter values in 2 scenarios:

    1. alpha, split and n_cycles reduced according to the multiple comparisons

    2. alpha and split reduced according to the multiple comparisons, and n_cycles not reduced

    Compare these 2 versions of ours against standard bagging and mixed bagging with the same number of parameters

datos_alpha_s <- datos %>% filter(alpha<12, split <16) 
datos_alpha_s <- datos_alpha_s %>% group_by(alpha, split, n_cycle, n_ensemble) %>%
  summarise_at(vars(accuracy_mean_mean),  list(accuracy_mean_dataset_mean = mean, accuracy_mean_dataset_median = median, accuracy_mean_dataset_std = sd))

For each combination of alpha and split we plot, with an orange dot, the maximum accuracy achieved and, with a blue dot, the number of ensembles from which there are no significant differences. The first thing to note is that the orange dot is not always placed at the maximum number of models tried (around 300), meaning that more models do not always imply better performance. The blue dot appears well before.

datos_alpha_s_1 <- datos_alpha_s %>% filter(alpha==2, split==1)
datos_alpha_s_1 <- as.data.frame(datos_alpha_s_1)
datos_alpha_s_1$n_cycle <- as.numeric(as.character(datos_alpha_s_1$n_cycle))
datos_alpha_s_1$n_ensemble <- as.numeric(as.character(datos_alpha_s_1$n_ensemble))

idmax = which.max(datos_alpha_s_1$accuracy_mean_dataset_mean)
# max(datos_alpha_s_1$accuracy_mean_dataset_mean)
max_acc_ensemble = datos_alpha_s_1[idmax,'n_ensemble']
max_signifi = dif_no_sig_all[(dif_no_sig_all$valor_alpha == 2) & (dif_no_sig_all$valor_split == 1),'max_num_models'] 
# datos_alpha_s_1[datos_alpha_s_1$n_ensemble==max_signifi,'accuracy_mean_dataset_mean']


plot(datos_alpha_s_1$n_ensemble, datos_alpha_s_1$accuracy_mean_dataset_mean, type='l', xlab='n ensembles', ylab = 'accuracy mean', main ='alpha = 2, split =1')
points(max_acc_ensemble, datos_alpha_s_1$accuracy_mean_dataset_mean[datos_alpha_s_1$n_ensemble == max_acc_ensemble], col='darkorange1', pch=19)
points(max_signifi, datos_alpha_s_1$accuracy_mean_dataset_mean[datos_alpha_s_1$n_ensemble == max_signifi], col='blue', pch=19)

The accuracy associated with each point is:

# The maximum in each case is
print(paste('Accuracy blue dot:', round(datos_alpha_s_1[datos_alpha_s_1$n_ensemble==max_signifi,'accuracy_mean_dataset_mean'],4)))
[1] "Accuracy blue dot: 0.8118"
print(paste('Accuracy orange dot:', round(max(datos_alpha_s_1$accuracy_mean_dataset_mean),4)))
[1] "Accuracy orange dot: 0.8132"

Let’s now make the same graph over the 40 combinations of alpha and split.

Common legend

  • Orange dot: the number of ensembles with the highest accuracy value
  • Blue dot: the largest number of ensembles beyond which there are no significant differences
  • X axis: number of ensembles.
  • Y axis: accuracy (averaged over datasets, complexity measures and cross-validation)
df_ranking <- data.frame(df_filtered$valores_combo)
colnames(df_ranking) <- 'valores_combo'
df_ranking$valor_split <- df_filtered$valor_split
df_ranking$valor_alpha <- df_filtered$valor_alpha
df_ranking$max_total <- rep(NA,dim(df_ranking)[1])
df_ranking$max_no_signif <- rep(NA,dim(df_ranking)[1])

# Grid layout (5 rows by 8 columns)
par(mfrow = c(5, 8), mar = c(2, 2, 2, 1))

max_acc_max_ensemble = 0

# Loop over alpha and split
for (alpha_value in c(2, 4, 6, 8, 10)) {
  for (split_value in c(1, 2, 4, 6, 8, 10, 12, 14)) {
    # Filter the data by alpha and split
    datos_alpha_s_1 <- datos_alpha_s %>% filter(alpha == alpha_value, split == split_value)
    datos_alpha_s_1 <- as.data.frame(datos_alpha_s_1)
    datos_alpha_s_1$n_cycle <- as.numeric(as.character(datos_alpha_s_1$n_cycle))
    datos_alpha_s_1$n_ensemble <- as.numeric(as.character(datos_alpha_s_1$n_ensemble))
    
    # Find the maximum
    idmax <- which.max(datos_alpha_s_1$accuracy_mean_dataset_mean)
    max_acc_ensemble <- datos_alpha_s_1[idmax, 'n_ensemble']
    # Store for the ranking
    df_ranking[(df_ranking$valor_alpha == alpha_value) & (df_ranking$valor_split == split_value),'max_total'] = max(datos_alpha_s_1$accuracy_mean_dataset_mean)

    # How many times the maximum accuracy is achieved with the maximum number of models
    max_acc_max_ensemble = max_acc_max_ensemble + sum(max_acc_ensemble== max(datos_alpha_s_1[,'n_ensemble']))
    max_signifi <- dif_no_sig_all[(dif_no_sig_all$valor_alpha == alpha_value) & (dif_no_sig_all$valor_split == split_value), 'max_num_models']
    # Store for the ranking
    df_ranking[(df_ranking$valor_alpha == alpha_value) & (df_ranking$valor_split == split_value),'max_no_signif'] = max(datos_alpha_s_1[datos_alpha_s_1$n_ensemble <= max_signifi,'accuracy_mean_dataset_mean'])
    
    # Plot
    plot(datos_alpha_s_1$n_ensemble, datos_alpha_s_1$accuracy_mean_dataset_mean, type = 'l', 
         xlab = 'n ensembles', ylab = 'accuracy mean', main = paste('alpha =', alpha_value, 'split =', split_value))
    
    # Add the corresponding points
    points(max_acc_ensemble, datos_alpha_s_1$accuracy_mean_dataset_mean[datos_alpha_s_1$n_ensemble == max_acc_ensemble], col = 'darkorange1', pch = 19)
    points(max_signifi, datos_alpha_s_1$accuracy_mean_dataset_mean[datos_alpha_s_1$n_ensemble == max_signifi], col = 'blue', pch = 19)
  }
}

# Reset the graphical parameters
par(mfrow = c(1, 1))

From the total of 40 combinations, in 16 of them the maximum accuracy is obtained at the maximum number of models tested.
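The count of 16 comes from the `max_acc_max_ensemble` counter accumulated in the loop above; the per-combination check it performs can be sketched on toy data:

```r
# Sketch: does the best accuracy occur at the largest ensemble size tried?
# Toy accuracy curve; in the notebook this is checked for each (alpha, split) case.
toy <- data.frame(n_ensemble = c(100, 200, 300),
                  acc = c(0.810, 0.813, 0.812))
best_at_max <- toy$n_ensemble[which.max(toy$acc)] == max(toy$n_ensemble)
best_at_max  # FALSE: the peak is at 200 models, not 300
```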

The same plot, but with a common y-axis to enable fair comparisons.

# Grid layout (5 rows by 8 columns)
par(mfrow = c(5, 8), mar = c(2, 2, 2, 1))

# Loop over alpha and split
for (alpha_value in c(2, 4, 6, 8, 10)) {
  for (split_value in c(1, 2, 4, 6, 8, 10, 12, 14)) {
    # Filter the data by alpha and split
    datos_alpha_s_1 <- datos_alpha_s %>% filter(alpha == alpha_value, split == split_value)
    datos_alpha_s_1 <- as.data.frame(datos_alpha_s_1)
    datos_alpha_s_1$n_cycle <- as.numeric(as.character(datos_alpha_s_1$n_cycle))
    datos_alpha_s_1$n_ensemble <- as.numeric(as.character(datos_alpha_s_1$n_ensemble))
    
    # Find the maximum
    idmax <- which.max(datos_alpha_s_1$accuracy_mean_dataset_mean)
    max_acc_ensemble <- datos_alpha_s_1[idmax, 'n_ensemble']
    max_signifi <- dif_no_sig_all[(dif_no_sig_all$valor_alpha == alpha_value) & (dif_no_sig_all$valor_split == split_value), 'max_num_models']
    
    # Plot
    plot(datos_alpha_s_1$n_ensemble, datos_alpha_s_1$accuracy_mean_dataset_mean, type = 'l', 
         xlab = 'n ensembles', ylab = 'accuracy mean', main = paste('alpha =', alpha_value, 'split =', split_value),ylim=c(0.810,0.8153))
    
    # Add the corresponding points
    points(max_acc_ensemble, datos_alpha_s_1$accuracy_mean_dataset_mean[datos_alpha_s_1$n_ensemble == max_acc_ensemble], col = 'darkorange1', pch = 19)
    points(max_signifi, datos_alpha_s_1$accuracy_mean_dataset_mean[datos_alpha_s_1$n_ensemble == max_signifi], col = 'blue', pch = 19)
  }
}

# Reset the graphical parameters
par(mfrow = c(1, 1))

Combining low split values with high alpha values is not recommended (split 1 with alpha = 6, 8, 10; split 2 with alpha 10): these obtain the lowest accuracy. –> omit split = 1 from the range of recommended parameters.

For the rest of values, similar patterns are found.

We now plot all the lines in the same plot. The one with clearly lower accuracy values is split = 1 and alpha = 10. The remaining combinations are visually very close, moving within a narrow range. This indicates that any of these parameter combinations is adequate.

datos_alpha_s$n_ensemble <- as.numeric(as.character(datos_alpha_s$n_ensemble))
datos_alpha_s$accuracy_mean_dataset_mean <- as.numeric(as.character(datos_alpha_s$accuracy_mean_dataset_mean))

p <- plot_ly()

for (alpha_value in c(2, 4, 6, 8, 10)) {
  for (split_value in c(1, 2, 4, 6, 8, 10, 12, 14)) {
    datos_alpha_s_1 <- datos_alpha_s %>% filter(alpha == alpha_value, split == split_value)
    p <- p %>%
      add_lines(x = datos_alpha_s_1$n_ensemble, 
                y = datos_alpha_s_1$accuracy_mean_dataset_mean, 
                name = paste("alpha =", alpha_value, "split =", split_value), 
                line = list(width = 2),
                hovertemplate = paste('Alpha: ', alpha_value, 
                                    ' Split:', split_value,
                                    '<br>N ensemble:', datos_alpha_s_1$n_ensemble,
                                    '<br>Accuracy:', round(datos_alpha_s_1$accuracy_mean_dataset_mean,4),
                                    '<extra></extra>'))
  }
}

p <- p %>%
  layout(title = 'All combinations of alpha and split',
         xaxis = list(title = 'n ensembles'),
         yaxis = list(title = 'accuracy mean'),
         legend = list(title = list(text = 'Legend')))

p

Let’s now obtain one ranking according to the maximum accuracy (orange dot) and another according to the maximum accuracy obtained up to the no-significant-differences cutoff (blue dot).

df_ranking_order <- df_ranking  %>% arrange(desc(max_total))
df_ranking_order_sig <- df_ranking  %>% arrange(desc(max_no_signif))
df_ranking$max_total_order = rank(-df_ranking$max_total)
df_ranking$max_no_signif_order = rank(-df_ranking$max_no_signif)
knitr::kable(df_ranking %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha8-split8 8 8 0.8137686 0.8128798 1 2
alpha10-split4 4 10 0.8137428 0.8125417 2 6
alpha8-split14 14 8 0.8137000 0.8125644 3 5
alpha6-split12 12 6 0.8136999 0.8131655 4 1
alpha10-split10 10 10 0.8136539 0.8126133 5 4
alpha4-split6 6 4 0.8136040 0.8123885 6 7
alpha2-split2 2 2 0.8135073 0.8109553 7 34
alpha10-split14 14 10 0.8135064 0.8120707 8 15
alpha6-split4 4 6 0.8134667 0.8120063 9 20
alpha4-split8 8 4 0.8134268 0.8120263 10 19
alpha6-split2 2 6 0.8133845 0.8120406 11 18
alpha4-split2 2 4 0.8133718 0.8111269 12 33
alpha10-split8 8 10 0.8133598 0.8122037 13 12
alpha8-split4 4 8 0.8133313 0.8120054 14 21
alpha6-split8 8 6 0.8133265 0.8121895 15 13
alpha4-split12 12 4 0.8132805 0.8116634 16 29
alpha6-split10 10 6 0.8132663 0.8119715 17 23
alpha4-split4 4 4 0.8132581 0.8118172 18 26
alpha8-split2 2 8 0.8132480 0.8109033 19 35
alpha10-split6 6 10 0.8132340 0.8117819 20 28
alpha2-split12 12 2 0.8132313 0.8119486 21 25
alpha2-split4 4 2 0.8132291 0.8120416 22 17
alpha2-split6 6 2 0.8132224 0.8122963 23 10
alpha10-split12 12 10 0.8132146 0.8123468 24 9
alpha2-split1 1 2 0.8131823 0.8117885 25 27
alpha8-split10 10 8 0.8131610 0.8120655 26 16
alpha4-split14 14 4 0.8131100 0.8123629 27 8
alpha4-split1 1 4 0.8130823 0.8106599 28 37
alpha6-split14 14 6 0.8130712 0.8128054 29 3
alpha2-split14 14 2 0.8130681 0.8115990 30 30
alpha6-split6 6 6 0.8130336 0.8119757 31 22
alpha8-split12 12 8 0.8129563 0.8120811 32 14
alpha2-split10 10 2 0.8129444 0.8122379 33 11
alpha8-split6 6 8 0.8128888 0.8119604 34 24
alpha10-split2 2 10 0.8128475 0.8114556 35 32
alpha2-split8 8 2 0.8128245 0.8115054 36 31
alpha4-split10 10 4 0.8127088 0.8108926 37 36
alpha8-split1 1 8 0.8124528 0.8102501 38 38
alpha6-split1 1 6 0.8123563 0.8101563 39 39
alpha10-split1 1 10 0.8115700 0.8093746 40 40
cor.test(df_ranking$max_total_order, df_ranking$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking$max_total_order and df_ranking$max_no_signif_order
S = 4826, p-value = 0.0003214
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.5472795 
cor.test(df_ranking$max_total_order, df_ranking$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking$max_total_order and df_ranking$max_no_signif_order
t = 4.0309, df = 38, p-value = 0.0002576
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.2842258 0.7337049
sample estimates:
      cor 
0.5472795 

The two rankings are only moderately correlated (rho ≈ 0.55), but they generally agree on what obtains the worst results (high alpha with low split). The lack of agreement on which combination is best can be interpreted as meaning that several different parameter combinations work well.
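This pattern of partial agreement is easy to reproduce: Spearman's rho compares ranks directly, and two toy rankings that agree at the bottom but shuffle the top yield a similarly moderate correlation:

```r
# Toy example: rankings that agree on the worst items but disagree near the top.
r1 <- c(1, 2, 3, 4, 5)
r2 <- c(3, 1, 2, 4, 5)
cor(r1, r2, method = "spearman")  # 0.7
```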

For now, it is not clear whether many short cycles or a few long cycles are better. An intermediate point between those two extremes seems adequate.

Analysis per Complexity Measure

Now, we repeat the same analysis but for each complexity measure.

#setwd("/home/carmen/PycharmProjects/EnsemblesComplexity/Results_general_algorithm_cycles")
# Data disaggregated per complexity measures
datos_CM <- read.csv('df_summary_CM.csv') 
str(datos_CM)
'data.frame':   29880 obs. of  8 variables:
 $ weights             : chr  "CLD" "CLD" "CLD" "CLD" ...
 $ n_cycle             : int  1 1 1 1 1 1 1 1 1 1 ...
 $ n_ensemble          : int  2 2 2 2 2 2 2 2 2 2 ...
 $ alpha               : int  2 4 6 8 10 12 14 16 18 20 ...
 $ split               : int  1 1 1 1 1 1 1 1 1 1 ...
 $ accuracy_mean_mean  : num  0.771 0.774 0.77 0.768 0.769 ...
 $ accuracy_mean_median: num  0.737 0.753 0.741 0.75 0.73 ...
 $ accuracy_mean_std   : num  0.131 0.132 0.132 0.133 0.13 ...
# Python indexing starts at 0, so add 1 to n_ensemble
datos_CM$n_ensemble <- datos_CM$n_ensemble + 1
# Convert id and time into factor variables
datos_CM <- datos_CM %>%
  convert_as_factor(weights, n_cycle,n_ensemble)
datos_CM_filtro <- datos_CM %>% filter(alpha<12, split <16) 
# No further aggregation needed: this dataset is already aggregated at source

For the information about significant differences, we use the values obtained in the overall analysis. That is, we do not repeat the multiple-comparison analyses for each complexity measure.

plot_2max_grid_with_ranking <- function(CM, df_filtered, dif_no_sig_all, datos_CM){
  df_ranking_CM <- data.frame(df_filtered$valores_combo)
  colnames(df_ranking_CM) <- 'valores_combo'
  df_ranking_CM$valor_split <- df_filtered$valor_split
  df_ranking_CM$valor_alpha <- df_filtered$valor_alpha
  df_ranking_CM$max_total <- rep(NA, dim(df_ranking_CM)[1])
  df_ranking_CM$max_no_signif <- rep(NA, dim(df_ranking_CM)[1])
  
  # Grid layout (5 rows by 8 columns)
  par(mfrow = c(5, 8), mar = c(2, 2, 2, 1))
  
  max_acc_max_ensemble = 0
  
  # Loop over alpha and split
  for (alpha_value in c(2, 4, 6, 8, 10)) {
    for (split_value in c(1, 2, 4, 6, 8, 10, 12, 14)) {
      # Filter the data by complexity measure, alpha and split
      datos_CM_case <- datos_CM %>% filter(weights == CM,
                                           alpha == alpha_value, split == split_value)
      datos_CM_case <- as.data.frame(datos_CM_case)
      datos_CM_case$n_cycle <- as.numeric(as.character(datos_CM_case$n_cycle))
      datos_CM_case$n_ensemble <- as.numeric(as.character(datos_CM_case$n_ensemble))
      
      # Find the maximum
      idmax <- which.max(datos_CM_case$accuracy_mean_mean)
      max_acc_ensemble <- datos_CM_case[idmax, 'n_ensemble']
      # Store for the ranking
      df_ranking_CM[(df_ranking_CM$valor_alpha == alpha_value) & (df_ranking_CM$valor_split == split_value), 'max_total'] = max(datos_CM_case$accuracy_mean_mean)
      
      # How many times the maximum accuracy is achieved with the maximum number of models
      max_acc_max_ensemble = max_acc_max_ensemble + sum(max_acc_ensemble == max(datos_CM_case[, 'n_ensemble']))
      max_signifi <- dif_no_sig_all[(dif_no_sig_all$valor_alpha == alpha_value) & (dif_no_sig_all$valor_split == split_value), 'max_num_models']
      # Store for the ranking
      df_ranking_CM[(df_ranking_CM$valor_alpha == alpha_value) & (df_ranking_CM$valor_split == split_value), 'max_no_signif'] = max(datos_CM_case[datos_CM_case$n_ensemble <= max_signifi, 'accuracy_mean_mean'])
      
      # Plot
      plot(datos_CM_case$n_ensemble, datos_CM_case$accuracy_mean_mean, type = 'l', 
           xlab = 'n ensembles', ylab = 'accuracy mean', main = paste('alpha =', alpha_value, 'split =', split_value), ylim = c(0.805, 0.818))
      
      # Add the corresponding points (orange dot offset slightly for visibility)
      points(max_acc_ensemble, datos_CM_case$accuracy_mean_mean[datos_CM_case$n_ensemble == max_acc_ensemble] + 0.0003, col = 'darkorange1', pch = 19)
      points(max_signifi, datos_CM_case$accuracy_mean_mean[datos_CM_case$n_ensemble == max_signifi], col = 'blue', pch = 19)
    }
  }
  
  # Reset the graphical parameters
  par(mfrow = c(1, 1))
  return(list(df_ranking_CM = df_ranking_CM, max_acc_max_ensemble = max_acc_max_ensemble))
}
plot_all_combinations <- function(CM, datos_CM_filtro){
  datos_CM_filtro$n_ensemble <- as.numeric(as.character(datos_CM_filtro$n_ensemble))
  datos_CM_filtro$accuracy_mean_mean <- as.numeric(as.character(datos_CM_filtro$accuracy_mean_mean))
  
  p <- plot_ly()
  
  for (alpha_value in c(2, 4, 6, 8, 10)) {
    for (split_value in c(1, 2, 4, 6, 8, 10, 12, 14)) {
      datos_CM_case <- datos_CM_filtro %>% filter(weights == CM,
        alpha == alpha_value, split == split_value)
      p <- p %>%
        add_lines(x = datos_CM_case$n_ensemble, 
                  y = datos_CM_case$accuracy_mean_mean, 
                  name = paste("alpha =", alpha_value, "split =", split_value), 
                  line = list(width = 2),
                  hovertemplate = paste('Alpha: ', alpha_value, 
                                        ' Split:', split_value,
                                        '<br>N ensemble:', datos_CM_case$n_ensemble,
                                        '<br>Accuracy:', round(datos_CM_case$accuracy_mean_mean, 4),
                                        '<extra></extra>'))
    }
  }
  
  p <- p %>%
    layout(title = paste(CM, ': All combinations of alpha and split'),
           xaxis = list(title = 'n ensembles'),
           yaxis = list(title = 'accuracy mean'),
           legend = list(title = list(text = 'Legend')))
  
  p
}

CLD

CM = 'CLD'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

From the total of 40 combinations, in 5 of them the maximum accuracy is obtained at the maximum number of models tested.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 6856, p-value = 0.02437
alternative hypothesis: true rho is not equal to 0
sample estimates:
     rho 
0.356848 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 2.3548, df = 38, p-value = 0.0238
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.05100891 0.60149436
sample estimates:
     cor 
0.356848 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha2-split2 2 2 0.8170067 0.8119554 1 30
alpha6-split8 8 6 0.8164724 0.8128871 2 22
alpha4-split2 2 4 0.8163923 0.8118918 3 31
alpha6-split4 4 6 0.8161981 0.8147587 4 3
alpha2-split1 1 2 0.8158910 0.8135514 5 13
alpha8-split2 2 8 0.8157094 0.8107373 6 34
alpha10-split8 8 10 0.8155802 0.8127223 7 23
alpha8-split8 8 8 0.8155570 0.8139353 8 9
alpha6-split12 12 6 0.8153948 0.8153948 9 1
alpha2-split12 12 2 0.8153240 0.8138678 10 11
alpha8-split14 14 8 0.8152281 0.8143362 11 7
alpha4-split1 1 4 0.8151386 0.8118221 12 32
alpha6-split10 10 6 0.8151157 0.8139530 13 8
alpha4-split10 10 4 0.8151077 0.8122986 14 28
alpha2-split10 10 2 0.8150844 0.8150844 15 2
alpha6-split6 6 6 0.8150701 0.8133809 16 15
alpha4-split6 6 4 0.8150327 0.8144878 17 6
alpha6-split2 2 6 0.8150215 0.8130135 18 19
alpha2-split8 8 2 0.8149203 0.8131022 19 18
alpha8-split12 12 8 0.8149007 0.8122999 20 27
alpha4-split12 12 4 0.8146885 0.8136674 21 12
alpha6-split14 14 6 0.8146647 0.8146647 22 4
alpha2-split14 14 2 0.8146582 0.8121991 23 29
alpha4-split4 4 4 0.8146255 0.8139241 24 10
alpha2-split6 6 2 0.8145971 0.8129641 25 20
alpha6-split1 1 6 0.8145562 0.8101187 26 39
alpha8-split10 10 8 0.8145504 0.8129534 27 21
alpha10-split14 14 10 0.8145420 0.8125098 28 25
alpha10-split12 12 10 0.8144953 0.8144953 29 5
alpha8-split4 4 8 0.8144661 0.8104014 30 38
alpha10-split10 10 10 0.8144590 0.8134577 31 14
alpha2-split4 4 2 0.8143867 0.8133121 32 17
alpha10-split2 2 10 0.8140532 0.8106446 33 36
alpha10-split4 4 10 0.8137194 0.8133667 34 16
alpha10-split6 6 10 0.8136665 0.8104432 35 37
alpha4-split14 14 4 0.8136351 0.8124670 36 26
alpha4-split8 8 4 0.8135236 0.8126300 37 24
alpha8-split1 1 8 0.8131522 0.8107084 38 35
alpha8-split6 6 8 0.8125105 0.8111202 39 33
alpha10-split1 1 10 0.8113378 0.8083389 40 40
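
The ranking comparison applied to each complexity measure in this section follows the same idiom. A minimal self-contained sketch with toy accuracy values (the numbers below are illustrative, not from the study): `rank(-x)` assigns rank 1 to the best combination, and `cor.test` then measures the agreement between the two orderings.

```r
# Toy illustration of the ranking comparison used for each complexity measure.
# Two hypothetical accuracy summaries per alpha-split combination.
df <- data.frame(
  combo         = c('alpha2-split2', 'alpha4-split4', 'alpha6-split6', 'alpha8-split8'),
  max_total     = c(0.817, 0.814, 0.812, 0.816),  # best accuracy over all ensemble sizes
  max_no_signif = c(0.812, 0.815, 0.811, 0.814)   # best accuracy restricted to non-significant sizes
)

# rank(-x) gives rank 1 to the largest value, i.e. the best combination.
df$max_total_order     <- rank(-df$max_total)
df$max_no_signif_order <- rank(-df$max_no_signif)

# Spearman correlation between the two orderings (as in the per-measure analysis).
ct <- cor.test(df$max_total_order, df$max_no_signif_order, method = 'spearman')
ct$estimate  # rho close to 1 would mean the two criteria rank combinations alike
```

With these toy values the two rankings disagree substantially, which is exactly the situation the medium-low correlations above reflect.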

DCP

CM = 'DCP'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the largest number of models tested in 12 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 3624, p-value = 6.424e-06
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.6600375 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 5.4161, df = 38, p-value = 3.597e-06
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.4387357 0.8058565
sample estimates:
      cor 
0.6600375 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha6-split12 12 6 0.8147103 0.8136477 1 2
alpha6-split10 10 6 0.8144093 0.8114648 2 22
alpha8-split10 10 8 0.8139975 0.8113694 3 23
alpha4-split12 12 4 0.8139335 0.8114653 4 21
alpha10-split8 8 10 0.8139266 0.8128619 5 8
alpha4-split14 14 4 0.8138976 0.8138976 6 1
alpha4-split6 6 4 0.8138080 0.8124640 7 12
alpha10-split12 12 10 0.8137957 0.8128915 8 7
alpha2-split6 6 2 0.8137577 0.8130594 9 5
alpha8-split12 12 8 0.8136205 0.8134695 10 3
alpha8-split8 8 8 0.8135090 0.8124147 11 13
alpha10-split10 10 10 0.8134586 0.8116237 12 20
alpha10-split6 6 10 0.8134325 0.8129717 13 6
alpha4-split4 4 4 0.8133329 0.8117402 14 19
alpha10-split14 14 10 0.8133130 0.8127365 15 9
alpha2-split10 10 2 0.8132891 0.8107793 16 27
alpha8-split14 14 8 0.8132440 0.8106366 17 29
alpha2-split2 2 2 0.8131651 0.8107160 18 28
alpha2-split12 12 2 0.8131103 0.8130995 19 4
alpha6-split6 6 6 0.8130948 0.8119806 20 17
alpha10-split4 4 10 0.8129710 0.8118737 21 18
alpha2-split4 4 2 0.8129460 0.8122897 22 15
alpha6-split4 4 6 0.8128217 0.8105099 23 30
alpha10-split2 2 10 0.8128146 0.8096225 24 37
alpha2-split14 14 2 0.8127683 0.8100325 25 35
alpha6-split14 14 6 0.8127666 0.8127218 26 10
alpha4-split2 2 4 0.8127212 0.8111535 27 25
alpha8-split6 6 8 0.8126942 0.8126942 28 11
alpha8-split4 4 8 0.8125527 0.8123460 29 14
alpha4-split10 10 4 0.8125382 0.8101914 30 33
alpha4-split8 8 4 0.8125367 0.8120263 31 16
alpha6-split8 8 6 0.8124704 0.8109615 32 26
alpha2-split8 8 2 0.8122937 0.8104892 33 31
alpha8-split2 2 8 0.8118116 0.8088298 34 39
alpha6-split2 2 6 0.8113261 0.8113261 35 24
alpha2-split1 1 2 0.8112334 0.8100022 36 36
alpha4-split1 1 4 0.8111423 0.8091420 37 38
alpha8-split1 1 8 0.8109833 0.8102332 38 32
alpha6-split1 1 6 0.8106142 0.8101910 39 34
alpha10-split1 1 10 0.8097165 0.8079750 40 40

F1

CM = 'F1'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the largest number of models tested in 10 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 6670, p-value = 0.01789
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.3742964 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 2.4882, df = 38, p-value = 0.01734
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.0710741 0.6141928
sample estimates:
      cor 
0.3742964 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha2-split8 8 2 0.8145392 0.8115605 1 20
alpha2-split12 12 2 0.8144666 0.8106633 2 30
alpha10-split14 14 10 0.8144357 0.8105123 3 32
alpha4-split8 8 4 0.8143072 0.8113034 4 22
alpha2-split2 2 2 0.8139121 0.8127721 5 4
alpha8-split2 2 8 0.8139063 0.8107802 6 28
alpha2-split4 4 2 0.8138354 0.8123443 7 8
alpha4-split6 6 4 0.8137402 0.8124732 8 7
alpha4-split12 12 4 0.8136563 0.8115991 9 19
alpha6-split8 8 6 0.8136179 0.8130388 10 2
alpha2-split6 6 2 0.8135237 0.8123212 11 9
alpha10-split10 10 10 0.8134848 0.8122740 12 10
alpha6-split2 2 6 0.8134833 0.8128000 13 3
alpha8-split14 14 8 0.8134690 0.8119110 14 14
alpha4-split4 4 4 0.8133041 0.8118787 15 16
alpha2-split10 10 2 0.8132534 0.8114831 16 21
alpha4-split1 1 4 0.8132299 0.8090499 17 39
alpha10-split4 4 10 0.8131868 0.8121041 18 12
alpha8-split8 8 8 0.8131475 0.8131475 19 1
alpha2-split14 14 2 0.8131095 0.8102393 20 33
alpha10-split6 6 10 0.8130989 0.8121760 21 11
alpha10-split8 8 10 0.8129562 0.8101262 22 35
alpha6-split6 6 6 0.8129060 0.8106589 23 31
alpha8-split6 6 8 0.8128675 0.8107632 24 29
alpha10-split2 2 10 0.8127789 0.8125516 25 6
alpha4-split14 14 4 0.8127036 0.8127036 26 5
alpha8-split10 10 8 0.8124341 0.8120072 27 13
alpha6-split10 10 6 0.8123895 0.8117557 28 17
alpha4-split2 2 4 0.8123711 0.8093017 29 38
alpha8-split4 4 8 0.8123309 0.8109876 30 26
alpha8-split12 12 8 0.8123302 0.8111266 31 25
alpha10-split12 12 10 0.8123077 0.8117054 32 18
alpha6-split12 12 6 0.8122357 0.8118946 33 15
alpha6-split4 4 6 0.8122205 0.8112718 34 23
alpha2-split1 1 2 0.8121696 0.8109076 35 27
alpha4-split10 10 4 0.8120861 0.8097717 36 37
alpha6-split1 1 6 0.8120093 0.8083411 37 40
alpha6-split14 14 6 0.8119268 0.8112652 38 24
alpha10-split1 1 10 0.8116933 0.8102062 39 34
alpha8-split1 1 8 0.8115319 0.8099124 40 36

Hostility

CM = 'Hostility'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the largest number of models tested in 3 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 5942, p-value = 0.004583
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.4425891 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 3.0425, df = 38, p-value = 0.00424
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.1520419 0.6627278
sample estimates:
      cor 
0.4425891 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha2-split1 1 2 0.8168037 0.8137780 1 14
alpha10-split10 10 10 0.8164441 0.8147962 2 3
alpha4-split2 2 4 0.8159181 0.8139483 3 12
alpha6-split2 2 6 0.8158157 0.8136721 4 16
alpha2-split6 6 2 0.8158069 0.8133697 5 20
alpha6-split14 14 6 0.8157487 0.8157487 6 1
alpha8-split14 14 8 0.8157018 0.8130325 7 23
alpha4-split6 6 4 0.8153792 0.8139061 8 13
alpha8-split8 8 8 0.8153602 0.8132562 9 22
alpha2-split12 12 2 0.8153208 0.8127042 10 26
alpha10-split6 6 10 0.8153202 0.8125840 11 28
alpha6-split12 12 6 0.8153135 0.8150494 12 2
alpha10-split2 2 10 0.8153093 0.8127008 13 27
alpha10-split4 4 10 0.8153018 0.8113297 14 38
alpha10-split8 8 10 0.8152675 0.8133793 15 19
alpha6-split4 4 6 0.8152648 0.8124711 16 30
alpha4-split1 1 4 0.8151955 0.8142191 17 8
alpha10-split14 14 10 0.8151795 0.8143744 18 6
alpha4-split8 8 4 0.8151152 0.8143455 19 7
alpha8-split6 6 8 0.8151148 0.8139486 20 11
alpha8-split2 2 8 0.8150418 0.8128073 21 25
alpha10-split12 12 10 0.8149884 0.8135886 22 17
alpha8-split4 4 8 0.8148831 0.8134877 23 18
alpha8-split10 10 8 0.8148682 0.8132699 24 21
alpha2-split10 10 2 0.8148585 0.8146820 25 4
alpha2-split14 14 2 0.8145951 0.8120284 26 33
alpha8-split1 1 8 0.8145323 0.8116835 27 36
alpha4-split4 4 4 0.8145079 0.8128430 28 24
alpha8-split12 12 8 0.8144676 0.8144238 29 5
alpha6-split6 6 6 0.8144368 0.8140087 30 10
alpha2-split2 2 2 0.8143956 0.8117464 31 35
alpha4-split14 14 4 0.8143091 0.8125763 32 29
alpha2-split4 4 2 0.8142897 0.8137697 33 15
alpha4-split10 10 4 0.8142000 0.8110554 34 39
alpha6-split8 8 6 0.8140883 0.8140883 35 9
alpha6-split10 10 6 0.8139869 0.8119314 36 34
alpha4-split12 12 4 0.8136268 0.8122984 37 31
alpha2-split8 8 2 0.8133821 0.8120486 38 32
alpha6-split1 1 6 0.8132397 0.8114526 39 37
alpha10-split1 1 10 0.8130067 0.8103577 40 40

kDN

CM = 'kDN'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the largest number of models tested in 12 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 5692, p-value = 0.002699
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.4660413 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 3.2471, df = 38, p-value = 0.002439
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.1807764 0.6789790
sample estimates:
      cor 
0.4660413 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha8-split6 6 8 0.8147422 0.8138763 1 2
alpha8-split8 8 8 0.8145975 0.8142340 2 1
alpha10-split2 2 10 0.8145920 0.8130155 3 7
alpha4-split8 8 4 0.8145505 0.8126739 4 12
alpha2-split4 4 2 0.8143902 0.8118260 5 21
alpha10-split8 8 10 0.8143356 0.8127684 6 10
alpha4-split2 2 4 0.8141905 0.8103841 7 37
alpha8-split10 10 8 0.8140803 0.8132863 8 5
alpha6-split1 1 6 0.8140776 0.8110029 9 32
alpha6-split8 8 6 0.8139449 0.8122894 10 15
alpha4-split6 6 4 0.8138990 0.8109489 11 33
alpha10-split4 4 10 0.8138941 0.8119121 12 20
alpha8-split4 4 8 0.8138367 0.8127963 13 9
alpha8-split2 2 8 0.8137070 0.8125466 14 13
alpha10-split12 12 10 0.8136363 0.8136363 15 3
alpha6-split10 10 6 0.8135921 0.8120103 16 18
alpha2-split1 1 2 0.8135071 0.8120134 17 17
alpha6-split2 2 6 0.8134892 0.8117951 18 22
alpha10-split14 14 10 0.8134518 0.8134518 19 4
alpha6-split12 12 6 0.8134512 0.8123098 20 14
alpha10-split6 6 10 0.8134355 0.8127365 21 11
alpha8-split14 14 8 0.8134229 0.8107666 22 35
alpha8-split12 12 8 0.8132940 0.8112317 23 31
alpha4-split1 1 4 0.8132675 0.8098249 24 39
alpha10-split10 10 10 0.8132451 0.8113676 25 28
alpha2-split14 14 2 0.8131401 0.8114086 26 25
alpha2-split10 10 2 0.8130711 0.8130711 27 6
alpha4-split4 4 4 0.8130709 0.8109018 28 34
alpha6-split4 4 6 0.8130310 0.8113718 29 26
alpha6-split14 14 6 0.8129780 0.8129780 30 8
alpha2-split2 2 2 0.8129191 0.8102802 31 38
alpha6-split6 6 6 0.8128726 0.8113598 32 29
alpha4-split10 10 4 0.8126771 0.8097656 33 40
alpha4-split14 14 4 0.8126645 0.8119789 34 19
alpha4-split12 12 4 0.8124764 0.8121818 35 16
alpha8-split1 1 8 0.8124742 0.8107599 36 36
alpha2-split8 8 2 0.8124223 0.8116371 37 23
alpha2-split6 6 2 0.8123787 0.8113194 38 30
alpha10-split1 1 10 0.8123200 0.8114989 39 24
alpha2-split12 12 2 0.8120971 0.8113713 40 27

LSC

CM = 'LSC'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the largest number of models tested in 3 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 5604, p-value = 0.002221
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.4742964 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 3.3211, df = 38, p-value = 0.001989
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.1910081 0.6846502
sample estimates:
      cor 
0.4742964 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha6-split2 2 6 0.8156958 0.8135153 1 4
alpha8-split1 1 8 0.8152436 0.8113056 2 31
alpha4-split1 1 4 0.8151646 0.8113802 3 30
alpha10-split10 10 10 0.8151370 0.8142957 4 1
alpha6-split1 1 6 0.8149074 0.8104559 5 38
alpha10-split6 6 10 0.8148603 0.8138281 6 3
alpha10-split2 2 10 0.8148394 0.8129602 7 9
alpha10-split4 4 10 0.8148016 0.8132584 8 7
alpha8-split6 6 8 0.8145733 0.8127895 9 11
alpha2-split12 12 2 0.8143889 0.8131028 10 8
alpha6-split12 12 6 0.8143311 0.8138959 11 2
alpha10-split8 8 10 0.8141521 0.8133867 12 6
alpha6-split4 4 6 0.8141442 0.8125848 13 13
alpha4-split12 12 4 0.8141189 0.8110734 14 36
alpha10-split12 12 10 0.8141102 0.8119607 15 22
alpha2-split2 2 2 0.8140368 0.8116427 16 25
alpha8-split8 8 8 0.8139935 0.8134549 17 5
alpha6-split8 8 6 0.8138947 0.8115815 18 26
alpha10-split1 1 10 0.8138755 0.8112875 19 32
alpha8-split2 2 8 0.8138549 0.8114592 20 28
alpha2-split4 4 2 0.8138195 0.8120708 21 17
alpha6-split10 10 6 0.8137241 0.8128841 22 10
alpha2-split1 1 2 0.8133330 0.8120971 23 16
alpha4-split4 4 4 0.8131666 0.8126618 24 12
alpha4-split8 8 4 0.8131353 0.8120382 25 19
alpha8-split14 14 8 0.8131281 0.8120232 26 20
alpha8-split4 4 8 0.8130225 0.8120621 27 18
alpha8-split12 12 8 0.8129857 0.8121759 28 15
alpha4-split10 10 4 0.8129063 0.8114541 29 29
alpha4-split6 6 4 0.8128994 0.8109701 30 37
alpha4-split2 2 4 0.8128740 0.8125076 31 14
alpha2-split14 14 2 0.8126306 0.8120040 32 21
alpha2-split8 8 2 0.8125772 0.8101316 33 40
alpha8-split10 10 8 0.8125589 0.8115034 34 27
alpha4-split14 14 4 0.8125100 0.8110980 35 34
alpha6-split6 6 6 0.8122107 0.8104127 36 39
alpha10-split14 14 10 0.8122076 0.8111400 37 33
alpha2-split6 6 2 0.8118625 0.8118625 38 23
alpha6-split14 14 6 0.8118220 0.8118220 39 24
alpha2-split10 10 2 0.8114987 0.8110956 40 35

N1

CM = 'N1'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the largest number of models tested in 6 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 4638, p-value = 0.000189
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.5649156 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 4.2203, df = 38, p-value = 0.0001461
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.3075230 0.7452742
sample estimates:
      cor 
0.5649156 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha8-split14 14 8 0.8143167 0.8127486 1 4
alpha6-split4 4 6 0.8143115 0.8134132 2 1
alpha4-split6 6 4 0.8139044 0.8125188 3 7
alpha6-split6 6 6 0.8138320 0.8120060 4 13
alpha10-split1 1 10 0.8136804 0.8096372 5 39
alpha4-split14 14 4 0.8135519 0.8127105 6 5
alpha8-split1 1 8 0.8135507 0.8123007 7 9
alpha4-split1 1 4 0.8134793 0.8116659 8 16
alpha2-split10 10 2 0.8134116 0.8130130 9 3
alpha6-split14 14 6 0.8133512 0.8133512 10 2
alpha8-split6 6 8 0.8133146 0.8123865 11 8
alpha8-split2 2 8 0.8131659 0.8110863 12 26
alpha6-split2 2 6 0.8130816 0.8107886 13 31
alpha2-split2 2 2 0.8130666 0.8113562 14 19
alpha10-split2 2 10 0.8130164 0.8120995 15 12
alpha4-split8 8 4 0.8129769 0.8113510 16 20
alpha8-split4 4 8 0.8128976 0.8122780 17 10
alpha2-split1 1 2 0.8128582 0.8107463 18 32
alpha6-split1 1 6 0.8127984 0.8119043 19 14
alpha10-split8 8 10 0.8127049 0.8113283 20 21
alpha10-split4 4 10 0.8126153 0.8126153 21 6
alpha4-split10 10 4 0.8126122 0.8104677 22 35
alpha2-split6 6 2 0.8125260 0.8106311 23 33
alpha6-split8 8 6 0.8124979 0.8112956 24 22
alpha4-split2 2 4 0.8124659 0.8096562 25 38
alpha8-split8 8 8 0.8124494 0.8118651 26 15
alpha4-split4 4 4 0.8124026 0.8108674 27 30
alpha10-split14 14 10 0.8123368 0.8085520 28 40
alpha6-split10 10 6 0.8122660 0.8110387 29 28
alpha10-split10 10 10 0.8122348 0.8110557 30 27
alpha6-split12 12 6 0.8121809 0.8121731 31 11
alpha8-split12 12 8 0.8121783 0.8116619 32 17
alpha4-split12 12 4 0.8121086 0.8109392 33 29
alpha8-split10 10 8 0.8120596 0.8111544 34 24
alpha10-split6 6 10 0.8120450 0.8110999 35 25
alpha2-split8 8 2 0.8119572 0.8114659 36 18
alpha2-split14 14 2 0.8118799 0.8112868 37 23
alpha2-split4 4 2 0.8117333 0.8103575 38 36
alpha2-split12 12 2 0.8116948 0.8098785 39 37
alpha10-split12 12 10 0.8114115 0.8105080 40 34

N2

CM = 'N2'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the largest number of models tested in 6 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 4544, p-value = 0.0001434
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.5737336 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 4.3181, df = 38, p-value = 0.0001087
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.3192886 0.7510184
sample estimates:
      cor 
0.5737336 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha10-split4 4 10 0.8160521 0.8155076 1 1
alpha4-split14 14 4 0.8159137 0.8142264 2 9
alpha6-split6 6 6 0.8158485 0.8141548 3 11
alpha8-split2 2 8 0.8157589 0.8118531 4 35
alpha10-split14 14 10 0.8156174 0.8144956 5 8
alpha4-split2 2 4 0.8154030 0.8141144 6 13
alpha2-split14 14 2 0.8153938 0.8146303 7 6
alpha8-split8 8 8 0.8153803 0.8135523 8 17
alpha6-split14 14 6 0.8153079 0.8145188 9 7
alpha10-split10 10 10 0.8151383 0.8141803 10 10
alpha2-split4 4 2 0.8151351 0.8141436 11 12
alpha4-split6 6 4 0.8150082 0.8136813 12 15
alpha2-split1 1 2 0.8149727 0.8149727 13 2
alpha6-split1 1 6 0.8149647 0.8117232 14 36
alpha8-split14 14 8 0.8149394 0.8149394 15 3
alpha4-split1 1 4 0.8148042 0.8126403 16 25
alpha6-split8 8 6 0.8147887 0.8136980 17 14
alpha10-split12 12 10 0.8147740 0.8112106 18 39
alpha6-split12 12 6 0.8147239 0.8147239 19 4
alpha6-split2 2 6 0.8147189 0.8120959 20 30
alpha8-split12 12 8 0.8147156 0.8147156 21 5
alpha10-split2 2 10 0.8146328 0.8131231 22 18
alpha8-split1 1 8 0.8146139 0.8118607 23 34
alpha8-split4 4 8 0.8144858 0.8128927 24 22
alpha10-split1 1 10 0.8143625 0.8125176 25 26
alpha4-split4 4 4 0.8143142 0.8120885 26 31
alpha8-split10 10 8 0.8142692 0.8131137 27 19
alpha4-split12 12 4 0.8142046 0.8126581 28 24
alpha2-split6 6 2 0.8141201 0.8136465 29 16
alpha6-split4 4 6 0.8141097 0.8121154 30 29
alpha2-split8 8 2 0.8141088 0.8120804 31 33
alpha6-split10 10 6 0.8140471 0.8121456 32 28
alpha10-split6 6 10 0.8139816 0.8113272 33 38
alpha10-split8 8 10 0.8139193 0.8128859 34 23
alpha4-split10 10 4 0.8139160 0.8116452 35 37
alpha2-split12 12 2 0.8136381 0.8120813 36 32
alpha2-split2 2 2 0.8135764 0.8111991 37 40
alpha2-split10 10 2 0.8135176 0.8121959 38 27
alpha4-split8 8 4 0.8133975 0.8128988 39 20
alpha8-split6 6 8 0.8133966 0.8128947 40 21

TD_U

CM = 'TD_U'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the largest number of models tested in 7 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 3744, p-value = 1.002e-05
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.6487805 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 5.2556, df = 38, p-value = 5.961e-06
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.4226991 0.7988428
sample estimates:
      cor 
0.6487805 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha2-split1 1 2 0.8139074 0.8122429 1 11
alpha10-split4 4 10 0.8138828 0.8126534 2 8
alpha10-split14 14 10 0.8138316 0.8132249 3 1
alpha2-split14 14 2 0.8134444 0.8118927 4 15
alpha8-split4 4 8 0.8133810 0.8128080 5 6
alpha2-split2 2 2 0.8133703 0.8109724 6 30
alpha2-split10 10 2 0.8133646 0.8117794 7 20
alpha4-split8 8 4 0.8133619 0.8114308 8 24
alpha8-split10 10 8 0.8133121 0.8115951 9 22
alpha4-split12 12 4 0.8133069 0.8118933 10 14
alpha10-split10 10 10 0.8132122 0.8117902 11 19
alpha8-split8 8 8 0.8131945 0.8121143 12 12
alpha8-split14 14 8 0.8131231 0.8131231 13 2
alpha4-split14 14 4 0.8131016 0.8127435 14 7
alpha6-split2 2 6 0.8130705 0.8112600 15 28
alpha2-split6 6 2 0.8129737 0.8129534 16 3
alpha4-split4 4 4 0.8129536 0.8129199 17 4
alpha2-split8 8 2 0.8129273 0.8124501 18 10
alpha10-split12 12 10 0.8129018 0.8129018 19 5
alpha8-split12 12 8 0.8128858 0.8101728 20 34
alpha6-split4 4 6 0.8128580 0.8124940 21 9
alpha4-split6 6 4 0.8128191 0.8117949 22 18
alpha4-split2 2 4 0.8127928 0.8118891 23 16
alpha6-split8 8 6 0.8126086 0.8111967 24 29
alpha6-split12 12 6 0.8125183 0.8105586 25 32
alpha2-split12 12 2 0.8124640 0.8108975 26 31
alpha6-split10 10 6 0.8124298 0.8118318 27 17
alpha2-split4 4 2 0.8123349 0.8112713 28 27
alpha4-split1 1 4 0.8123037 0.8099085 29 36
alpha6-split14 14 6 0.8122679 0.8119238 30 13
alpha6-split6 6 6 0.8120207 0.8113816 31 25
alpha4-split10 10 4 0.8119948 0.8116936 32 21
alpha10-split6 6 10 0.8118973 0.8114637 33 23
alpha8-split6 6 8 0.8117065 0.8104464 34 33
alpha10-split8 8 10 0.8116883 0.8113719 35 26
alpha6-split1 1 6 0.8111772 0.8083185 36 37
alpha8-split1 1 8 0.8110935 0.8077964 37 38
alpha8-split2 2 8 0.8110155 0.8101183 38 35
alpha10-split1 1 10 0.8102222 0.8065404 39 40
alpha10-split2 2 10 0.8097531 0.8075450 40 39

General conclusions of the analysis per complexity measure

  • The worst results are found for extreme values of the parameters: split = 1 combined with high alpha, or high alpha together with high split (the first situation being the worse of the two). This is expected: with split = 1, we divide the complexity spectrum into only three pieces (easy - base - hard), and a very high alpha multiplies the easy and the hard cases by a really large value. Thus, instead of training with the situation easy - base - hard, we are training with super easy - base - super hard and, consequently, the complexity spectrum is not correctly covered.

  • For every complexity measure, different values of the parameters offer the best solution. They generally have in common that intermediate values of alpha and split are better, although there are some exceptions; for example, alpha and split both equal to 10 work well for Hostility. From this, we can conclude that cycles of intermediate length are better than cycles that are too short (s = 1) or too long (s = 10).

  • We have a total of 40 combinations of alpha and split values. For those 40 cases, the highest accuracy is obtained at the maximum number of models tested (around 300) in about 7 cases on average across the complexity measures. In particular, the maximum accuracy is obtained with the maximum number of tested models in the following number of cases (out of 40) for each specific complexity measure:

    • CLD: 5 times

    • DCP: 12 times

    • F1: 10 times

    • Hostility: 3 times

    • kDN: 12 times

    • LSC: 3 times

    • N1: 6 times

    • N2: 6 times

    • TD_U: 7 times

It is interesting to examine whether the complexity measures for which the highest accuracy is achieved with the maximum number of models tested are also those with the worst overall performance.
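The per-measure counts above can be reproduced with a small helper. A minimal sketch, assuming a long-format data frame with hypothetical columns `combo`, `n_ensembles` and `accuracy` (these names are illustrative, not the notebook's actual variables):

```r
library(dplyr)

# For one complexity measure, count in how many alpha-split combinations the
# best accuracy occurs exactly at the largest ensemble size tested.
count_max_at_largest <- function(df) {
  df %>%
    group_by(combo) %>%
    summarise(
      best_at_largest = n_ensembles[which.max(accuracy)] == max(n_ensembles),
      .groups = 'drop'
    ) %>%
    summarise(n = sum(best_at_largest)) %>%
    pull(n)
}

# Toy data: combination 'a' peaks at its largest size, 'b' peaks earlier.
toy <- data.frame(
  combo       = rep(c('a', 'b'), each = 3),
  n_ensembles = rep(c(100, 200, 300), times = 2),
  accuracy    = c(0.80, 0.81, 0.82,   0.80, 0.82, 0.81)
)
count_max_at_largest(toy)  # 1
```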

  • Regarding the correlation (both Pearson and Spearman) between the ranking of alpha-split combinations by the maximum accuracy obtained over all the ensembles tested and the ranking obtained when considering only those numbers of ensembles for which there are significant differences: in general, the correlation is medium to low, indicating that there is no clear agreement between the two rankings. In any case, we have to weigh the real differences: is it really worthwhile to keep training more and more models for perhaps an increase of 0.001 in accuracy? The differences are not significant, and all the graphs have shown that the accuracy always moves within a narrow interval of values. In particular:

    • CLD: corr = 0.357

    • DCP: corr = 0.660

    • F1: corr = 0.374

    • Hostility: corr = 0.443

    • kDN: corr = 0.466

    • LSC: corr = 0.474

    • N1: corr = 0.565

    • N2: corr = 0.574

    • TD_U: corr = 0.649

      • To deal with this, we are going to compare our method with SOTA methods in two different situations:

        • Considering all the ensembles tested

        • Considering only those where there are significant differences
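How small the accuracy gain from training up to the overall maximum typically is can be checked directly from the `max_total` and `max_no_signif` columns of the ranking tables. A sketch using the top four rows of the CLD table above:

```r
# Gap between the best accuracy over all ensemble sizes (max_total) and the
# best accuracy restricted to non-significant sizes (max_no_signif), taken
# from the first rows of the CLD ranking table.
max_total     <- c(0.8170067, 0.8164724, 0.8163923, 0.8161981)
max_no_signif <- c(0.8119554, 0.8128871, 0.8118918, 0.8147587)

gain <- max_total - max_no_signif
summary(gain)        # gains on the order of 1e-3 to 5e-3
round(max(gain), 4)  # largest gain among these rows
```

Even the largest of these gaps is around half a percentage point of accuracy, which supports the point that training many more models buys very little.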